Parallel Reference Speaker Weighting for Kinematic-Independent Acoustic-to-Articulatory Inversion
Similar resources
Vocal Tract Length Normalization for Speaker Independent Acoustic-to-Articulatory Speech Inversion
Speech inversion is a well-known ill-posed problem, and the addition of speaker differences typically makes it even harder. This paper investigates a vocal tract length normalization (VTLN) technique to transform the acoustic space of different speakers to a target speaker space such that speaker-specific details are minimized. The speaker-normalized features are then used to train a feed-forward ne...
Acoustic to articulatory inversion
The context of this work is speech analysis. The subject deals with acoustic-to-articulatory inversion, i.e. the recovery of the temporal evolution of the vocal tract shape from the signal. This topic is important because it is likely to give rise to applications in the domains of speech coding as well as second language learning. Acoustic-to-articulatory inversion relies on an analysis by synt...
Formant trajectories for acoustic-to-articulatory inversion
This work examines the utility of formant frequencies and their energies in acoustic-to-articulatory inversion. For this purpose, formant frequencies and formant spectral amplitudes are automatically estimated from audio, and are treated as observations for the purpose of estimating electromagnetic articulography (EMA) coil positions. A mixture Gaussian regression model with mel-frequency cepst...
Jerk Minimization for Acoustic-To-Articulatory Inversion
Effortless speech production in humans requires coordinated movements of articulators such as the lips, tongue, jaw, and velum. The measured trajectories are therefore smooth and slowly varying. However, the trajectories estimated from acoustic-to-articulatory inversion (AAI) are found to be jagged. Thus, energy minimization is used as a smoothness constraint for improving performance ...
Information theoretic acoustic feature selection for acoustic-to-articulatory inversion
We use mutual information as the criterion to rank the mel-frequency cepstral coefficients (MFCCs) and their derivatives according to the information they provide about different articulatory features in acoustic-to-articulatory (AtoA) inversion. It is found that just a small subset of the coefficients encodes maximal information about articulatory features, and interestingly, this subset is art...
Journal
Journal title: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Year: 2016
ISSN: 2329-9290, 2329-9304
DOI: 10.1109/taslp.2016.2588340